Consolidate Dataclass data update methods - use DIB for update only by XingY · Pull Request #7483 · LabKey/platform

XingY · 2026-03-10T03:06:33Z

Rationale

Related Pull Requests

Changes

Copilot

Pull request overview

This PR consolidates DataClass and Sample update behavior around DataIteratorBuilder (DIB)-based updates, removing LSID as an update/merge key and cleaning up deprecated DataClass storage by dropping the provisioned lsid column via a module upgrade.

Changes:

Remove LSID-as-key support for sample/data update & merge; enforce RowId or Name-based keys and ignore LSID when provided alongside valid keys.
Introduce shared updateRowsUsingPartitionedDIB() logic in AbstractQueryUpdateService and route Sample/DataClass updates through it.
Remove provisioned DataClass lsid column (domain kind + upgrade code + DB scripts) and update integration tests accordingly.

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 7 comments.

Show a summary per file

File	Description
experiment/src/org/labkey/experiment/samples/AbstractExpFolderImporter.java	Stops setting the removed `UseLsidForUpdate` option during TSV import.
experiment/src/org/labkey/experiment/api/SampleTypeUpdateServiceDI.java	Routes updates through partitioned DIB; removes LSID-key paths and legacy update methods.
experiment/src/org/labkey/experiment/api/ExpDataImpl.java	Selects DataClass provisioned properties by RowId (not LSID) to support dropped provisioned LSID column.
experiment/src/org/labkey/experiment/api/ExpDataClassDataTableImpl.java	Enforces RowId/Name keys, blocks LSID-only keying, blocks RowId in merge (behind feature), and uses partitioned DIB updates.
experiment/src/org/labkey/experiment/api/DataClassDomainKind.java	Removes `lsid` from provisioned DataClass base properties, indexes, and FKs.
experiment/src/org/labkey/experiment/ExperimentUpgradeCode.java	Adds deferred upgrade to drop provisioned DataClass `lsid` column (and related indices).
experiment/src/org/labkey/experiment/ExperimentModule.java	Bumps schema version to 26.005 and centralizes experimental flag constant usage.
experiment/src/org/labkey/experiment/ExpDataIterators.java	Updates derivation/update logic to use RowId/Name instead of LSID-for-update config; removes LSID update config usage.
experiment/src/client/test/integration/DataClassCrud.ispec.ts	Expands integration coverage for keying behavior (RowId/Name/LSID) and partitioned-duplicate detection.
experiment/resources/schemas/dbscripts/sqlserver/exp-26.004-26.005.sql	Executes Java upgrade to drop provisioned DataClass `lsid` column (SQL Server).
experiment/resources/schemas/dbscripts/postgresql/exp-26.004-26.005.sql	Executes Java upgrade to drop provisioned DataClass `lsid` column (PostgreSQL).
api/src/org/labkey/api/query/DefaultQueryUpdateService.java	Removes conversion-table hook and always uses `getDbTable()` for `convertTypes()` entrypoint.
api/src/org/labkey/api/query/AbstractQueryUpdateService.java	Adds `updateRowsUsingPartitionedDIB()` and small null-iterator safety in `_pump()`.
api/src/org/labkey/api/exp/query/ExpDataTable.java	Reorders enum constants (no functional behavior change expected).
api/src/org/labkey/api/exp/api/ExperimentService.java	Removes `UseLsidForUpdate` query option; adds shared experimental feature constant.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-03-10T03:15:16Z

experiment/src/client/test/integration/DataClassCrud.ispec.ts

+        // using update from file, verify update using rowId for data that doesn't exist on this datacalss should fail.
+        errorResp = await ExperimentCRUDUtils.importData(server, "RowId\tDescription\n" + row3RowId + "\tupdate\n", emptyDataClass, "UPDATE", topFolderOptions, editorUserOptions);
+        expect(errorResp.text).toContain('Data not found for [' + row3RowId + ']');


Typo in comment: "datacalss" should be "dataclass".

Copilot · 2026-03-10T03:15:17Z

experiment/src/client/test/integration/DataClassCrud.ispec.ts

+        // update date twice specifying the rowId across multiple partitions
+        await server.post('query', 'updateRows', {


Typo in comment: "update date" should be "update data".

Copilot · 2026-03-10T03:15:17Z

experiment/src/org/labkey/experiment/ExperimentModule.java

+        OptionalFeatureService.get().addExperimentalFeatureFlag(ExperimentService.EXPERIMENTAL_FEATURE_ALLOW_ROW_ID_MERGE, "Allow RowId to be accepted when merging samples or dataclass data",
                "If the incoming data includes a RowId column we will allow the column but ignore it's values.", false);


Grammar in this feature description: "ignore it's values" should be "ignore its values".

Copilot · 2026-03-10T03:15:17Z

experiment/src/org/labkey/experiment/ExpDataIterators.java

                di = new SampleUpdateDerivationDataIterator(di, context, _container, _user, _currentDataType, _checkRequiredParents);
            else
-                di = new DataUpdateDerivationDataIterator(di, context, _container, _user, _currentDataType, _checkRequiredParents);
+                di = new DataUpdateDerivationDataIterator(di, context, _container, _user, _currentDataType, _checkRequiredParents);//


Remove the trailing "//" at the end of this line; it looks like leftover debugging text and adds noise to the code.

Copilot · 2026-03-10T03:15:18Z

experiment/src/org/labkey/experiment/ExpDataIterators.java

            if (hasNext)
            {
-                String lsid = (String) get(_lsidCol);
+                String lsid = (String) get(_lsidCol); // why lsid?, insert or merge


Please remove the inline comment "// why lsid?, insert or merge". It reads like a note-to-self and doesn't clarify behavior for future readers.

Copilot · 2026-03-10T03:15:18Z

experiment/src/org/labkey/experiment/api/ExpDataClassDataTableImpl.java

            for (Map<String, Object> dataRow : dataRows.values())
-                lsids.add((String) dataRow.get("lsid"));
+                lsids.add((String) dataRow.get("lsid")); // ?


Remove the "// ?" comment here. If there's an open question about whether LSIDs are always present in these rows, it should be resolved in code (e.g., null-check + clear error) rather than left as an inline question.

Copilot · 2026-03-10T03:15:18Z

experiment/src/client/test/integration/DataClassCrud.ispec.ts

+        // update date twice specifying the name across multiple partitions
        await server.post('query', 'updateRows', {


Typo in comment: "update date" should be "update data".

labkey-adam · 2026-03-10T03:19:51Z

@XingY there's an exp-26.004-26.005.sql script in 26.3 that should be merged to develop shortly.

XingY · 2026-03-10T16:56:49Z

@XingY there's an exp-26.004-26.005.sql script in 26.3 that should be merged to develop shortly.

OK. Thanks for the heads-up.

XingY · 2026-03-23T17:12:40Z

Finding #1 — [Severity: Critical] — SQL Ambiguity — experiment/src/org/labkey/experiment/api/ExpDataClassType.java:87
Issue: After joining the provisioned table (ds) with exp.data (d), the unqualified column reference Name in the SQL SELECT is ambiguous. The provisioned table still has a name column (from BASE_PROPERTIES), and exp.data also has a
Name column. PostgreSQL and SQL Server will both raise an ambiguous column error at runtime.

The generated SQL is effectively:
SELECT ... concat(?, ':', Name) AS Description
FROM expdataclass.storageName ds
JOIN exp.data d ON d.rowId = ds.rowid
WHERE ...
Both ds.name and d.Name exist. On PostgreSQL, unquoted Name normalizes to name, matching both.

Why it matters: Any DataClass with attachment-type properties will cause a SQL error when the attachment parent lookup runs (e.g., during attachment integrity checks, admin pages, or file management), breaking the feature
entirely.
Suggestion: Qualify the column reference: new SQLFragment("ds.Name") or new SQLFragment("d.Name").

Finding #2 — [Severity: High] — Behavioral side effect — api/src/org/labkey/api/dataiterator/StatementDataIterator.java:539-544
Issue: The change to get(int i) now falls through to _data.get(i) when a key column's value is null:
if (null != _keyColumnInfo.get(i))
{
Object value = _keyValues.get(i);
if (value != null)
return value;
}
return _data.get(i);
Previously, if _keyColumnInfo.get(i) was non-null (i.e., it IS a key column), the key value was always returned — even if null. Now, a null key value causes fallthrough to the data source. This change is global to
StatementDataIterator and affects all code paths using it, not just DataClass updates. Any UPDATE/INSERT where a key column legitimately has a null generated value would now silently return the data-source value instead of null,
potentially masking real issues or changing behavior.

Why it matters: This is a shared, low-level data iterator class. A behavioral change here could subtly affect SampleType updates, List updates, or any other table that goes through StatementDataIterator. Bugs would manifest as
incorrect data being written.
Suggestion: Consider scoping this change more narrowly — perhaps only fall through for the LSID column specifically, or add a flag that DataClassUpdateAddColumnsDataIterator sets to opt-in to this behavior. At minimum, add a
comment explaining WHY this change is needed and what invariant it relies on.

Finding #3 — [Severity: Medium] — Incorrect comment — experiment/src/org/labkey/experiment/ExperimentUpgradeCode.java:1002
Issue: The Javadoc says Called from exp-26.004-26.005.sql but the actual SQL scripts are exp-26.005-26.006.sql.
Why it matters: Misleading comments make future debugging and maintenance harder.
Suggestion: Change to Called from exp-26.005-26.006.sql.

Finding #4 — [Severity: Medium] — Wrong import / semantic mismatch — experiment/src/org/labkey/experiment/api/ExpDataClassDataTableImpl.java:1031,1045
Issue: The file imports LSID from ExpMaterialTable.Column (the Sample/Material enum) and uses it in DataClass context:
import static org.labkey.api.exp.query.ExpMaterialTable.Column.LSID;
...
keysCheck.contains(LSID.name())
...
new ValidationException("RowId is not accepted when merging data...", ExpMaterialTable.Column.RowId.name())
While .name() returns "LSID"/"RowId" for both enums so it's functionally correct, using the Material enum in a DataClass context is semantically confusing and could mislead future readers.
Why it matters: Maintainability — a future developer may think this is intentionally referencing Material behavior.
Suggestion: Import from ExpDataTable.Column instead.

Finding #5 — [Severity: Medium] — Debug artifacts left in code — multiple files
Issue: Several WIP/debug comments and artifacts remain:

ExpDataIterators.java:912: di = new DataUpdateDerivationDataIterator(...);// — trailing // comment
ExpDataIterators.java:1183: String lsid = (String) get(_lsidCol); // why lsid?, insert or merge — uncertainty comment
ExpDataClassDataTableImpl.java:1318: lsids.add((String) dataRow.get("lsid")); // ? — question mark comment

Why it matters: These look like personal notes/TODOs that should be resolved before merge.
Suggestion: Resolve the questions these comments raise, then remove or rewrite them as proper documentation.

Finding #6 — [Severity: Medium] — Duplicate detection only catches first duplicate — api/src/org/labkey/api/query/AbstractQueryUpdateService.java:985-1003
Issue: checkPartitionForDuplicates returns immediately after finding the first duplicate:
if (rowId != null && !globalRowIds.add(rowId))
{
errors.addRowError(new ValidationException("Duplicate key provided: " + rowId));
return;
}
This means if there are multiple duplicate keys, only the first is reported. The user must fix one, re-submit, find the next, etc.
Why it matters: Poor UX for bulk operations. Users may need multiple round-trips to fix all duplicates.
Suggestion: Continue checking and report all duplicates in one pass (remove the early return), or accept this as intentional behavior and add a comment explaining why.

Finding #7 — [Severity: Low] — Null configParameters passed to recordDataIteratorUsed — experiment/src/org/labkey/experiment/api/ExpDataClassDataTableImpl.java:1451
Issue:
Map<Enum, Object> finalConfigParameters = configParameters == null ? new HashMap<>() : configParameters;
recordDataIteratorUsed(configParameters); // original null value passed, not finalConfigParameters
If configParameters is null, the original null is passed to recordDataIteratorUsed rather than finalConfigParameters.
Why it matters: If recordDataIteratorUsed tries to read or modify the map, it could NPE. (This is pre-existing in the SampleType code, but worth noting as it was copied.)
Suggestion: Pass finalConfigParameters instead.

Finding #8 — [Severity: Low] — _tableInfo field never used — api/src/org/labkey/api/dataiterator/DataClassUpdateAddColumnsDataIterator.java:38
Issue: The field private final TableInfo _tableInfo is assigned in the constructor but never read anywhere in the class.
Why it matters: Dead code.
Suggestion: Remove the field if it's not needed.

Phase 4: Security

No security issues found. The changes maintain existing authorization patterns — updates go through the standard QueryUpdateService pipeline which checks permissions. The SimpleFilter in
DataClassUpdateAddColumnsDataIterator.prefetchExisting() properly scopes queries by ClassId and Container, preventing cross-container or cross-DataClass data access. No user input is directly concatenated into SQL.

Phase 5: Interactions and Side Effects

StatementDataIterator change (Finding Refactor WebDAV URL generation to make it easier to use in other code… #2) is the highest-risk interaction. This class is used broadly across the platform, and the null-key fallthrough behavior change could affect SampleTypes, Lists, or other update paths.
The SampleTypeUpdateServiceDI refactoring correctly delegates partitioned update logic to the parent class and overrides validatePartitionedRowKeys to call confirmAmountAndUnitsColumns. The deleted coerceTypes method was marked
@deprecated, so its removal is expected.
The updateRow() and _update() methods in ExpDataClassDataTableImpl.DataClassDataUpdateService now throw UnsupportedOperationException. Any caller that reaches these through the non-DIB path (e.g., via super.updateRows() with
oldKeys provided) would crash. The test coverage should verify this doesn't happen, but there could be callers outside the experiment module.
Tests are comprehensive for the new DataClass behavior — they cover LSID rejection, RowId/Name updates, partitioned updates, cross-DataClass validation, and merge restrictions. However, there don't appear to be tests
specifically for the StatementDataIterator change's impact on other entity types.

Overall Assessment

This is a well-structured consolidation that brings DataClass updates in line with the established SampleType DIB pattern and properly eliminates the LSID provisioned column. The most critical issue is the ambiguous Name column in
ExpDataClassType.java (Finding #1), which will cause a SQL error at runtime for any DataClass with attachment properties — this must be fixed before merge. The StatementDataIterator behavioral change (Finding #2) is the second
most concerning item since it modifies shared infrastructure with potentially broad impact; it deserves extra scrutiny and ideally a more targeted approach. The remaining findings are medium/low severity cleanup items that would
improve code quality.

XingY added 3 commits March 9, 2026 19:23

Consolidate Dataclass data update methods - use DIB for update only

e661698

Enable upgrade script

09dd6b0

CRLF

b5e2ac2

XingY requested a review from Copilot March 10, 2026 03:07

Copilot started reviewing on behalf of XingY March 10, 2026 03:08 View session

Copilot AI reviewed Mar 10, 2026

View reviewed changes

XingY added 8 commits March 10, 2026 11:16

Merge remote-tracking branch 'origin/develop' into fb_sourceDIB

dc19ef1

fix build

4d9ff01

merge from develop

a678378

fix merge

a68e8a8

fix merge

9d5aa75

fix update

4893660

Merge remote-tracking branch 'origin/develop' into fb_sourceDIB

4b5e0ef

Merge remote-tracking branch 'origin/develop' into fb_sourceDIB

d3aea2b

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Consolidate Dataclass data update methods - use DIB for update only#7483

Consolidate Dataclass data update methods - use DIB for update only#7483
XingY wants to merge 11 commits intodevelopfrom
fb_sourceDIB

XingY commented Mar 10, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Mar 10, 2026

Uh oh!

Copilot AI Mar 10, 2026

Uh oh!

Copilot AI Mar 10, 2026

Uh oh!

Copilot AI Mar 10, 2026

Uh oh!

Copilot AI Mar 10, 2026

Uh oh!

Copilot AI Mar 10, 2026

Uh oh!

Copilot AI Mar 10, 2026

Uh oh!

labkey-adam commented Mar 10, 2026

Uh oh!

XingY commented Mar 10, 2026

Uh oh!

XingY commented Mar 23, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

		// update date twice specifying the rowId across multiple partitions
		await server.post('query', 'updateRows', {

		OptionalFeatureService.get().addExperimentalFeatureFlag(ExperimentService.EXPERIMENTAL_FEATURE_ALLOW_ROW_ID_MERGE, "Allow RowId to be accepted when merging samples or dataclass data",
		"If the incoming data includes a RowId column we will allow the column but ignore it's values.", false);

		// update date twice specifying the name across multiple partitions
		await server.post('query', 'updateRows', {

Conversation

XingY commented Mar 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Rationale

Related Pull Requests

Changes

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

Uh oh!

labkey-adam commented Mar 10, 2026

Uh oh!

XingY commented Mar 10, 2026

Uh oh!

XingY commented Mar 23, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

XingY commented Mar 10, 2026 •

edited

Loading